TimeML-strict: clarifying temporal annotation
Abstract
TimeML is an XML-based schema for annotating temporal information over discourse. The standard has been used to annotate a variety of resources and is followed by a number of tools, the creation of which constitutes hundreds of thousands of man-hours of research work. However, many current resources are not valid, do not produce valid output, or contain ambiguous or custom additions and removals. Difficulties arising from these variances were highlighted in the TempEval-3 exercise, which responded by adding its own stipulations on top of conventional TimeML. To unify the state of current resources, and to make progress toward easy adoption of its current incarnation, ISO-TimeML, this paper introduces TimeML-strict: a valid, unambiguous, and easy-to-process subset of TimeML. We also introduce three resources: a schema for TimeML-strict; a validator tool for TimeML-strict, so that one may ensure documents are in the correct form; and a repair tool that corrects common invalidating errors and adds disambiguating markup in order to convert documents from the laxer TimeML standard to TimeML-strict.
Similar resources
ClearTK-TimeML: A minimalist approach to TempEval 2013
The ClearTK-TimeML submission to TempEval 2013 competed in all English tasks: identifying events, identifying times, and identifying temporal relations. The system is a pipeline of machine-learning models, each with a small set of features from a simple morpho-syntactic annotation pipeline, and where temporal relations are only predicted for a small set of syntactic constructions and relation t...
Multilinguality in Temporal Annotation: A Case of Korean
The aim of this paper is to apply TimeML, an annotation scheme for events and temporal expressions, to the annotation of Korean in an attempt to test its multilingual extendability. TimeML has been well validated by the successful annotation of a corpus of 186 news articles and some other documents in English. One of its remaining tasks, however, is multilingual extension. This paper aims at co...
Tense and Time Annotations: a Contribution to TimeML Improvement (Annotation de la temporalité en corpus : contribution à l'amélioration de la norme TimeML) [in French]
This paper reports a critical analysis of the TimeML standard, in the light of a temporal annotation that was conducted on spoken French. It shows that the norm suffers from weaknesses that must be corrected to fit the needs of NLP and corpus linguistics. These limitations concern mainly 1) the separation of different levels of linguistic annotation, 2) the delimitation in the text of the event...
TimeML-Compliant Analysis of Text Documents
Reasoning with temporal information requires a representation of time considerably more involved than just a list of temporal expressions, which typically define the extent of current time extraction efforts. TimeML is an emerging standard for temporal annotation, defining a language for expressing properties and relationships among time-denoting expressions and events in free text. This paper t...
Annotating Events, Temporal Expressions and Relations in Italian: the It-Timeml Experience for the Ita-TimeBank
This paper presents the annotation guidelines and specifications developed for the creation of the Italian TimeBank, a language resource composed of two corpora manually annotated with temporal and event information. In particular, the adaptation of the TimeML scheme to Italian is described, and special attention is given to the methodology used for the realization of the anno...
Journal title:
- CoRR
Volume abs/1304.7289, Issue
Pages -
Publication date 2013